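Thanks to recent advances in deep learning, sophisticated generation tools now exist that produce extremely realistic synthetic speech. However, malicious use of such tools is possible and could pose a serious threat to our society. Synthetic speech detection has therefore become a pressing research topic, and a wide variety of detection methods have recently been proposed. Unfortunately, they hardly generalize to synthetic audio produced by tools never seen during training, which makes them unfit for real-world scenarios. In this work, we aim to overcome this problem by proposing a new detection approach that leverages only the biometric characteristics of the speaker, with no reference to specific manipulations. Because the detector is trained only on real data, generalization is automatically ensured. The proposed approach can be implemented with off-the-shelf speaker verification tools. We test several such solutions on three popular test sets, obtaining good performance, high generalization ability, and high robustness.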
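The speaker-verification idea in this abstract can be illustrated with a minimal sketch. It assumes that some off-the-shelf speaker encoder (not shown, and not specified in the abstract) has already turned each utterance into a fixed-length embedding. The function names, the max-similarity scoring rule, and the threshold value are all hypothetical choices made for illustration; they are not taken from the paper.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def identity_score(
    test_embedding: Sequence[float],
    reference_embeddings: Sequence[Sequence[float]],
) -> float:
    """Highest similarity between the test utterance and any real
    reference utterance of the claimed speaker (assumed scoring rule)."""
    if not reference_embeddings:
        raise ValueError("at least one reference embedding is required")
    return max(cosine_similarity(test_embedding, ref) for ref in reference_embeddings)


def is_suspected_synthetic(
    test_embedding: Sequence[float],
    reference_embeddings: Sequence[Sequence[float]],
    threshold: float = 0.5,  # hypothetical value; would be tuned on real data
) -> bool:
    """Flag the utterance when it does not match the claimed speaker well."""
    return identity_score(test_embedding, reference_embeddings) < threshold
```

Because only real reference audio of the claimed speaker is needed, a detector of this kind never sees synthetic examples, which is the property the abstract relies on for generalization.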
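The ever higher quality and wide diffusion of fake images have spawned a quest for reliable forensic tools. Many GAN image detectors have recently been proposed. In real-world scenarios, however, most of them show limited robustness and generalization ability. Moreover, they often rely on side information that is not available at test time; that is, they are not universal. We investigate these problems and propose a new GAN image detector based on a limited sub-sampling architecture and a suitable contrastive learning paradigm. Experiments carried out under challenging conditions show that the proposed method is a first step towards universal GAN image detection, ensuring good robustness to common image impairments as well as good generalization to unseen architectures.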
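In recent years, there has been growing interest in deep learning-based pansharpening. Research has mainly focused on architectures. However, because no ground truth is available, model training is also a major issue. A popular approach is to train networks in a reduced-resolution domain, using the original data as ground truth. The trained networks are then used on full-resolution data, relying on an implicit scale-invariance assumption. Results are generally good at reduced resolution but more questionable at full resolution. Here, we propose a full-resolution training framework for deep learning-based pansharpening. Training takes place in the high-resolution domain, relying only on the original data, with no loss of information. To ensure spectral and spatial fidelity, suitable losses are defined that force the pansharpened output to be consistent with the available panchromatic and multispectral input. Experiments on WorldView-3, WorldView-2, and GeoEye-1 images show that methods trained with the proposed framework guarantee excellent performance in terms of both full-resolution numerical indexes and visual quality. The framework is fully general and can be used to train and fine-tune any deep learning-based pansharpening network.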
Recent years have seen a proliferation of research on adversarial machine learning. Numerous papers demonstrate powerful algorithmic attacks against a wide variety of machine learning (ML) models, and numerous other papers propose defenses that can withstand most attacks. However, abundant real-world evidence suggests that actual attackers use simple tactics to subvert ML-driven systems, and as a result security practitioners have not prioritized adversarial ML defenses. Motivated by the apparent gap between researchers and practitioners, this position paper aims to bridge the two domains. We first present three real-world case studies from which we can glean practical insights unknown or neglected in research. Next we analyze all adversarial ML papers recently published in top security conferences, highlighting positive trends and blind spots. Finally, we state positions on precise and cost-driven threat modeling, collaboration between industry and academia, and reproducible research. We believe that our positions, if adopted, will increase the real-world impact of future endeavours in adversarial ML, bringing both researchers and practitioners closer to their shared goal of improving the security of ML systems.
Deep spiking neural networks (SNNs) offer the promise of low-power artificial intelligence. However, training deep SNNs from scratch or converting deep artificial neural networks to SNNs without loss of performance has been a challenge. Here we propose an exact mapping from a network with Rectified Linear Units (ReLUs) to an SNN that fires exactly one spike per neuron. For our constructive proof, we assume that an arbitrary multi-layer ReLU network with or without convolutional layers, batch normalization and max pooling layers was trained to high performance on some training set. Furthermore, we assume that we have access to a representative example of input data used during training and to the exact parameters (weights and biases) of the trained ReLU network. The mapping from deep ReLU networks to SNNs causes zero percent drop in accuracy on CIFAR10, CIFAR100 and the ImageNet-like data sets Places365 and PASS. More generally our work shows that an arbitrary deep ReLU network can be replaced by an energy-efficient single-spike neural network without any loss of performance.
Deep learning-based object detection is a powerful approach for detecting faulty insulators in power lines. This involves training an object detection model from scratch, or fine-tuning a model that is pre-trained on benchmark computer vision datasets. This approach works well with a large number of insulator images, but can result in unreliable models in the low-data regime. The current literature mainly focuses on detecting the presence or absence of insulator caps, which is a relatively easy detection task, and does not consider detection of finer faults such as flashed and broken disks. In this article, we formulate three object detection tasks for insulator and asset inspection from aerial images, focusing on incipient faults in disks. We curate a large reference dataset of insulator images that can be used to learn robust features for detecting healthy and faulty insulators. We study the advantage of using this dataset in the low target data regime by pre-training on the reference dataset followed by fine-tuning on the target dataset. The results suggest that object detection models can be used to detect faults in insulators at a much earlier, incipient stage, and that transfer learning adds value depending on the type of object detection model. We identify key factors that dictate performance in the low-data regime and outline potential approaches to improve the state of the art.
Deploying machine learning models in production may allow adversaries to infer sensitive information about training data. There is a vast literature analyzing different types of inference risks, ranging from membership inference to reconstruction attacks. Inspired by the success of games (i.e., probabilistic experiments) to study security properties in cryptography, some authors describe privacy inference risks in machine learning using a similar game-based style. However, adversary capabilities and goals are often stated in subtly different ways from one presentation to the other, which makes it hard to relate and compose results. In this paper, we present a game-based framework to systematize the body of knowledge on privacy inference risks in machine learning.
Semi-Supervised Learning (SSL) has recently achieved notable success in various fields such as image classification, object detection, and semantic segmentation, which typically require a lot of labour to construct ground truth. In the depth estimation task especially, annotating training data is very costly and time-consuming, and thus the recent SSL regime seems an attractive solution. In this paper, for the first time, we introduce a novel framework for semi-supervised learning of monocular depth estimation networks, using consistency regularization to mitigate the reliance on large ground-truth depth data. We propose a novel data augmentation approach, called K-way disjoint masking, which allows the network to learn how to reconstruct invisible regions so that the model not only becomes robust to perturbations but also generates globally consistent output depth maps. Experiments on the KITTI and NYU-Depth-v2 datasets demonstrate the effectiveness of each component in our pipeline, robustness to the use of fewer and fewer annotated images, and superior results compared to other state-of-the-art semi-supervised methods for monocular depth estimation. Our code is available at https://github.com/KU-CVLAB/MaskingDepth.
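The abstract does not spell out how K-way disjoint masking is built, so the following is only one plausible reading, sketched under stated assumptions: the image is split into patches, the patch indices are shuffled, and each patch is made visible in exactly one of K masked views. The function name `k_way_disjoint_masks` and the round-robin assignment are hypothetical; the linked repository is the authoritative source for the actual procedure.

```python
import random
from typing import List, Optional


def k_way_disjoint_masks(
    num_patches: int, k: int, seed: Optional[int] = None
) -> List[List[bool]]:
    """Return k boolean visibility masks over `num_patches` patches.

    Each patch is visible (True) in exactly one mask, so the visible
    sets are pairwise disjoint and together cover every patch.
    """
    if k < 1 or k > num_patches:
        raise ValueError("k must satisfy 1 <= k <= num_patches")
    rng = random.Random(seed)
    order = list(range(num_patches))
    rng.shuffle(order)  # random assignment of patches to views
    masks = [[False] * num_patches for _ in range(k)]
    for position, patch in enumerate(order):
        # Round-robin over the shuffled order keeps view sizes balanced.
        masks[position % k][patch] = True
    return masks
```

In a consistency-regularization setup, each masked view would be passed through the network and the predictions encouraged to agree, which is how the model could learn to fill in regions it cannot see in a given view.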
A systematic review was conducted of machine-learning strategies for improving the generalizability (cross-subject and cross-session) of electroencephalography (EEG)-based emotion classification. In this context, the non-stationarity of EEG signals is a critical issue and can lead to the Dataset Shift problem. Several architectures and methods have been proposed to address this issue, mainly based on transfer learning methods. A total of 418 papers were retrieved from the Scopus, IEEE Xplore and PubMed databases through a search query focusing on modern machine learning techniques for generalization in EEG-based emotion assessment. Among these papers, 75 were found eligible based on their relevance to the problem. Studies lacking a specific cross-subject and cross-session validation strategy, or making use of other biosignals as support, were excluded. On the basis of the analysis of the selected papers, a taxonomy of the studies employing Machine Learning (ML) methods was proposed, together with a brief discussion of the different ML approaches involved. The studies with the best results in terms of average classification accuracy were identified, supporting the view that transfer learning methods tend to perform better than other approaches. The review also discusses the impact of (i) the theoretical models of emotion and (ii) psychological screening of the experimental sample on classifier performance.
We extend best-subset selection to linear Multi-Task Learning (MTL), where a set of linear models is jointly trained on a collection of datasets ("tasks"). Allowing the regression coefficients of tasks to have different sparsity patterns (i.e., different supports), we propose a modeling framework for MTL that encourages models to share information across tasks, for a given covariate, through separately 1) shrinking the coefficient supports together, and/or 2) shrinking the coefficient values together. This allows models to borrow strength during variable selection even when the coefficient values differ markedly between tasks. We express our modeling framework as a Mixed-Integer Program, and propose efficient and scalable algorithms based on block coordinate descent and combinatorial local search. We show our estimator achieves statistically optimal prediction rates. Importantly, our theory characterizes how our estimator leverages the shared support information across tasks to achieve better variable selection performance. We evaluate the performance of our method in simulations and two biology applications. Our proposed approaches outperform other sparse MTL methods in variable selection and prediction accuracy. Interestingly, penalties that shrink the supports together often outperform penalties that shrink the coefficient values together. We will release an R package implementing our methods.
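To make the two kinds of shrinkage concrete, here is one way such an objective could be written. This is a sketch under assumptions, not the paper's stated formulation: the symbols $K$ (number of tasks), $n_k$, $X_k$, $y_k$, $\beta_k$, the binary support indicators $z_k$, the averages $\bar\beta$ and $\bar z$, the sparsity budget $s$, and the penalty weights $\lambda$ and $\delta$ are all introduced here for illustration.

```latex
\begin{aligned}
\min_{\{\beta_k, z_k\},\, \bar\beta,\, \bar z} \quad
  & \sum_{k=1}^{K} \frac{1}{n_k} \lVert y_k - X_k \beta_k \rVert_2^2
    + \lambda \sum_{k=1}^{K} \lVert \beta_k - \bar\beta \rVert_2^2
    + \delta \sum_{k=1}^{K} \lVert z_k - \bar z \rVert_2^2 \\
\text{s.t.} \quad
  & z_k \in \{0,1\}^p, \qquad
    \beta_{k,j}\,(1 - z_{k,j}) = 0, \qquad
    \sum_{j=1}^{p} z_{k,j} \le s, \qquad k = 1, \dots, K .
\end{aligned}
```

In this sketch the $\lambda$ term pulls coefficient values toward a common vector, while the $\delta$ term pulls the supports toward a common pattern. Setting $\lambda = 0$ keeps only the support-sharing penalty, which corresponds to the case the abstract highlights: tasks select similar covariates even when their coefficient values differ markedly. The binary variables $z_k$ are what make the problem a mixed-integer program.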